15 research outputs found

    Temporal Attention-Gated Model for Robust Sequence Classification

    Typical techniques for sequence classification are designed for well-segmented sequences which have been edited to remove noisy or irrelevant parts. Therefore, such methods cannot be easily applied to the noisy sequences expected in real-world applications. In this paper, we present the Temporal Attention-Gated Model (TAGM), which integrates ideas from attention models and gated recurrent networks to better deal with noisy or unsegmented sequences. Specifically, we extend the concept of the attention model to measure the relevance of each observation (time step) of a sequence. We then use a novel gated recurrent network to learn the hidden representation for the final prediction. An important advantage of our approach is interpretability, since the temporal attention weights provide a meaningful value for the salience of each time step in the sequence. We demonstrate the merits of our TAGM approach, both for prediction accuracy and interpretability, on three different tasks: spoken digit recognition, text-based sentiment analysis and visual event recognition. Comment: Accepted by CVPR 201

    The Cambridge Face Tracker: Accurate, Low Cost Measurement of Head Posture Using Computer Vision and Face Recognition Software.

    PURPOSE: We validate a video-based method of head posture measurement. METHODS: The Cambridge Face Tracker uses neural networks (constrained local neural fields) to recognize facial features in video. The relative position of these facial features is used to calculate head posture. First, we assess the accuracy of this approach against videos in three research databases where each frame is tagged with a precisely measured head posture. Second, we compare our method to a commercially available mechanical device, the Cervical Range of Motion device: four subjects each adopted 43 distinct head postures that were measured using both methods. RESULTS: The Cambridge Face Tracker achieved confident facial recognition in 92% of the approximately 38,000 frames of video from the three databases. The respective mean error in absolute head posture was 3.34°, 3.86°, and 2.81°, with a median error of 1.97°, 2.16°, and 1.96°. The accuracy decreased with more extreme head posture. Comparing The Cambridge Face Tracker to the Cervical Range of Motion Device gave correlation coefficients of 0.99 (P < 0.0001), 0.96 (P < 0.0001), and 0.99 (P < 0.0001) for yaw, pitch, and roll, respectively. CONCLUSIONS: The Cambridge Face Tracker performs well under real-world conditions and within the range of normally-encountered head posture. It allows useful quantification of head posture in real time or from precaptured video. Its performance is similar to that of a clinically validated mechanical device. It has significant advantages over other approaches in that subjects do not need to wear any apparatus, and it requires only low cost, easy-to-setup consumer electronics. TRANSLATIONAL RELEVANCE: Noncontact assessment of head posture allows more complete clinical assessment of patients, and could benefit surgical planning in future

    Estimation of accuracy of the trees and logs volume tables

    Object of study: stems of pine, spruce, birch, and alder trees and the roundwood produced from them. Aim: to compare the stem volume of the same trees as estimated by over-bark stem volume tables with the volume obtained from the compound Huber formula, tree volume structure tables, and log volume tables, and to analyze the resulting differences for timber accounting. Methods: statistical and empirical. Results: the stem volume tables overestimate over-bark stem volume by 2.2% on average for all tree species combined. They overestimate the over-bark stem volume of pine by 4.4% on average and underestimate that of black alder by 2.5%. Data from the sampled cutting areas show that the merchantable wood volume for all tree species combined, as determined from the tree volume structure tables, is overestimated by 3.5%. Non-merchantable wood makes up 15% of the volume in the plots on average. The log volume tables are sufficiently accurate for estimating roundwood volume; the volume difference is not significant. Keywords: stems, wood, production, volume differences, bark. Žemės ūkio akademija, Vytauto Didžiojo universiteta

    Crowdsourcing in emotion studies across time and culture

    Crowdsourcing is becoming increasingly popular as a cheap and effective tool for multimedia annotation. However, the idea is not new, and can be traced back to Charles Darwin. He was interested in studying the universality of facial expressions in conveying emotions, so he had to consider a global population. Access to different cultures allowed him to reach more general conclusions. In this paper, we highlight a few milestones in the history of the study of emotion that share the concepts of crowdsourcing. We first consider the study of posed photographs and then move to videos of natural expressions. We present our use of crowdsourcing to label a video corpus of natural expressions, and also to recreate one of Darwin’s original emotion judgment experiments. This allows us to compare people’s perception of emotional expressions in the 19th and 21st centuries, showing that it remains stable across both culture and time.

    Automatic analysis of naturalistic hand-over-face gestures

    One of the main factors that limit the accuracy of facial analysis systems is hand occlusion. As the face becomes occluded, facial features are lost, corrupted, or erroneously detected. Hand-over-face occlusions are considered not only very common but also very challenging to handle. However, there is empirical evidence that some of these hand-over-face gestures serve as cues for recognition of cognitive mental states. In this article, we present an analysis of automatic detection and classification of hand-over-face gestures. We detect hand-over-face occlusions and classify hand-over-face gesture descriptors in videos of natural expressions using multi-modal fusion of different state-of-the-art spatial and spatio-temporal features. We show experimentally that we can successfully detect face occlusions with an accuracy of 83%. We also demonstrate that we can classify gesture descriptors (hand shape, hand action, and facial region occluded) significantly better than a naïve baseline. Our detailed quantitative analysis sheds some light on the challenges of automatic classification of hand-over-face gestures in natural expressions